
University of Washington Structural Informatics Group Publications

The RNA Ontology (RNAO): An ontology for integrating RNA sequence and structure data

Author: Chris Mungall
Colin Batchelor
Craig Zirbel
Eric Westhof
Jane Richardson
Jesse Stombaugh
Karen Eilbeck
Neocles Leontis
Rob Knight
Thomas Bittner
Publication date: 06/08/2009

Biomedical ontologies are intended to integrate diverse biomedical data to enable intelligent data-mining and facilitate translation of basic research into useful clinical knowledge. We present the first version of RNAO, an ontology for integrating RNA 3D structural, biochemical and sequence data. While each 3D data file depicts the structure of a specific molecule, such data have broader significance as representatives of classes of homologous molecules, which, while differing in sequence, generally share core structural features of functional importance. Thus, 3D structure data gain value by being linked to homologous sequences in genomic data and databases of sequence alignments. Likewise, genomic data can increase in value by annotation of shared structural features, especially when these can be linked to specific functions. The RNAO is being developed in line with the developing standards of the Open Biomedical Ontologies (OBO) Consortium.

Nature Precedings

Recommended from our members

The Sequence Ontology: a tool for the unification of genome annotations.

Author: Ashburner Michael
Durbin Richard
Eilbeck Karen
Lewis Suzanna E
Mungall Christopher J
Stein Lincoln
Yandell Mark
Publication venue: Genome Biol
Publication date: 17/06/2011

The Sequence Ontology (SO) is a structured controlled vocabulary for the parts of a genomic annotation. SO provides a common set of terms and definitions that will facilitate the exchange, analysis and management of genomic data. Because SO treats part-whole relationships rigorously, data described with it can become substrates for automated reasoning, and instances of sequence features described by the SO can be subjected to a group of logical operations termed extensional mereology operators.
Rights: This article is licensed under the BioMed Central licence at http://www.biomedcentral.com/about/license, which is similar to the 'Creative Commons Attribution Licence'. In brief, you may copy, distribute, and display the work; make derivative works; or make commercial use of the work, under the following conditions: the original author must be given credit; for any reuse or distribution, it must be made clear to others what the license terms of this work are.

Apollo (Cambridge)

The OBO Foundry: Coordinated Evolution of Ontologies to Support Biomedical Data Integration

Author: Ashburner Michael
Bard Jonathan
Bug William
Ceusters Werner
Eilbeck Karen
Goldberg Louis J
Ireland Amelia
Leontis Neocles
Lewis Suzanna
Mungall Christopher J
OBI Consortium The
Rocca‐Serra
Rosse Cornelius
Ruttenberg Alan
Sansone Susanna-Assunta
Shah Nigam
Smith Barry
Whetzel Patricia L
Publication date: 01/01/2007

The value of any kind of data is greatly enhanced when it exists in a form that allows it to be integrated with other data. One approach to integration is through the annotation of multiple bodies of data using common controlled vocabularies or ‘ontologies’. Unfortunately, the very success of this approach has led to a proliferation of ontologies, which itself creates obstacles to integration. The Open Biomedical Ontologies (OBO) consortium has set in train a strategy to overcome this problem. Existing OBO ontologies, including the Gene Ontology, are undergoing a process of coordinated reform, and new ontologies are being created, on the basis of an evolving set of shared principles governing ontology development. The result is an expanding family of ontologies designed to be interoperable, logically well-formed, and to incorporate accurate representations of biological reality. We describe the OBO Foundry initiative, and provide guidelines for those who might wish to become involved in the future.

Pseudogene accumulation in the evolutionary histories of Salmonella enterica serovars Paratyphi A and Typhi

Author: Bason Nathalie
Bhutta Zulfiqar A
Dougan Gordon
Hasan Rumina
Holt Kathryn E
Langridge Gemma C
Mungall Karen
Norbertczak Halina
Parkhill Julian
Quail Michael A
Simmonds Mark
Thomson Nicholas R
Wain John
Walker Danielle
White Brian
Publication venue: BioMed Central
Publication date: 01/01/2009

Background: Of the >2000 serovars of Salmonella enterica subspecies I, most cause self-limiting gastrointestinal disease in a wide range of mammalian hosts. However, S. enterica serovars Typhi and Paratyphi A are restricted to the human host and cause the similar systemic diseases typhoid and paratyphoid fever. Genome sequence similarity between Paratyphi A and Typhi has been attributed to convergent evolution via relatively recent recombination of a quarter of their genomes. The accumulation of pseudogenes is a key feature of these and other host-adapted pathogens, and overlapping pseudogene complements are evident in Paratyphi A and Typhi.
Results: We report the 4.5 Mbp genome of a clinical isolate of Paratyphi A, strain AKU_12601, completely sequenced using capillary techniques and subsequently checked using Illumina/Solexa resequencing. Comparison with the published genome of Paratyphi A ATCC9150 revealed the two are collinear and highly similar, with 188 single nucleotide polymorphisms and 39 insertions/deletions. A comparative analysis of pseudogene complements of these and two finished Typhi genomes (CT18, Ty2) identified several pseudogenes that had been overlooked in prior genome annotations of one or both serovars, and identified 66 pseudogenes shared between serovars. By determining whether each shared and serovar-specific pseudogene had been recombined between Paratyphi A and Typhi, we found evidence that most pseudogenes have accumulated after the recombination between serovars. We also divided pseudogenes into relative-time groups: ancestral pseudogenes inherited from a common ancestor, pseudogenes recombined between serovars which likely arose between initial divergence and later recombination, serovar-specific pseudogenes arising after recombination but prior to the last evolutionary bottlenecks in each population, and more recent strain-specific pseudogenes.
Conclusion: Recombination and pseudogene-formation have been important mechanisms of genetic convergence between Paratyphi A and Typhi, with most pseudogenes arising independently after extensive recombination between the serovars. The recombination events, along with divergence of and within each serovar, provide a relative time scale for pseudogene-forming mutations, affording rare insights into the progression of functional gene loss associated with host adaptation in Salmonella.

Springer - Publisher Connector

LSHTM Research Online

University of East Anglia digital repository

University of Melbourne Institutional Repository

The complete genome sequence and comparative genome analysis of the high pathogenicity Yersinia enterocolitica strain 8081

Author: Achtman
Andrews
Antonenko
Bell
Berriman
Bohin
Brendan W. Wren
Brubaker
Brussow
Carlin
Carol Churcher
Carver
Chain
Chen
Collyn
Cornelis
Darwin
Day
Deng
Foultier
Franke
Gibbons
Godde
Gordon Dougan
Gregory L. Challis
Grozdanov
Haft
Haller
Heesemann
Heidi Hauser
Hinchliffe
Howard
Iwobi
Jansen
Julian Parkhill
Kachlany
Karen Brooks
Karen Mungall
Kay Jagels
Kim
Lacroix
Lawrence
Lisa Crossman
Ljungberg
Maier
Mandy Sanders
Mark Maddison
Matthew T. G. Holden
McNally
Menon
Michael A. Quail
Michael B. Prentice
Mohd-Zain
Neyt
Nicholas R. Thomson
Ochman
Olson
Parkhill
Perna
Perry
Pickard
Planet
Portnoy
Porwollik
Pourcel
Prentice
Ren
Rosqvist
Roth
Sally Whitehead
Sarah Howard
Schaible
Schreiner
Schubert
Sekowska
Sharon Moule
Slee
Snellings
Song
Theresa Feltwell
Thomson
Tracey Chillingworth
Van Noyen
Vignais
Wauters
Weissfeld
Welch
Wray
Wren
Young
Zahra Abdellah
Zogaj
Publication venue: Public Library of Science (PLoS)
Publication date: 01/01/2006

The human enteropathogen, Yersinia enterocolitica, is a significant link in the range of Yersinia pathologies extending from mild gastroenteritis to bubonic plague. Comparison at the genomic level is a key step in our understanding of the genetic basis for this pathogenicity spectrum. Here we report the genome of Y. enterocolitica strain 8081 (serotype O:8; biotype 1B) and extensive microarray data relating to the genetic diversity of the Y. enterocolitica species. Our analysis reveals that the genome of Y. enterocolitica strain 8081 is a patchwork of horizontally acquired genetic loci, including a plasticity zone of 199 kb containing an extraordinarily high density of virulence genes. Microarray analysis has provided insights into species-specific Y. enterocolitica gene functions and the intraspecies differences between the high, low, and nonpathogenic Y. enterocolitica biotypes. Through comparative genome sequence analysis we provide new information on the evolution of the Yersinia. We identify numerous loci that represent ancestral clusters of genes potentially important in enteric survival and pathogenesis, which have been lost, or are in the process of being lost, in the other sequenced Yersinia lineages. Our analysis also highlights large metabolic operons in Y. enterocolitica that are absent in the related enteropathogen, Yersinia pseudotuberculosis, indicating major differences in niche and nutrients used within the mammalian gut. These include clusters directing the production of hydrogenases, tetrathionate respiration, cobalamin synthesis, and propanediol utilisation. Along with ancestral gene clusters, the genome of Y. enterocolitica has revealed species-specific and enteropathogen-specific loci. This has provided important insights into the pathology of this bacterium and, more broadly, into the evolution of the genus. Moreover, wider investigations looking at the patterns of gene loss and gain in the Yersinia have highlighted common themes in the genome evolution of other human enteropathogens.

Public Library of Science (PLOS)

LSHTM Research Online

Warwick Research Archives Portal Repository

University of Melbourne Institutional Repository

University of St. Andrews - Pure

St Andrews Research Repository

Annotation of two large contiguous regions from the Haemonchus contortus genome using RNA-seq and comparative analysis with Caenorhabditis elegans

Author: A Coghlan
A Couthier
AJ Wolstenholme
Anna V. Protasio
C Liu
Clotilde K. S. Carlow
DB Guiliano
DL Laughton
DL Redmond
DP Knox
E Ghedin
E Redman
F Jackson
Frank Jackson
Gary Saunders
H Li
J Parkinson
J Spieth
JC Abbott
JH Graber
JL Bessereau
JM Ranz
John S. Gilleard
JR Vanfleteren
JS Gilleard
K Rutherford
Karen Mungall
L Duret
L Rufener
LD Stein
LF LeJambre
LW Hillier
M Caceres
M Deutsch
Martin Hunt
Matthew Berriman
Michael Quail
MJ Callaghan
PS Chain
R Hoekstra
R Kaminsky
R Prichard
Robin Beech
Roz Laing
S Chen
S Leroy
Steven Laing
T Blumenthal
T Carver
TJ Carver
V Grillo
W Qian
Y Tanizawa
Publication venue: Public Library of Science (PLoS)
Publication date: 15/08/2011

The genomes of numerous parasitic nematodes are currently being sequenced, but their complexity and size, together with high levels of intra-specific sequence variation and a lack of reference genomes, make their assembly and annotation a challenging task. Haemonchus contortus is an economically significant parasite of livestock that is widely used for basic research as well as for vaccine development and drug discovery. It is one of many medically and economically important parasites within the strongylid nematode group. This group of parasites has the closest phylogenetic relationship with the model organism Caenorhabditis elegans, making comparative analysis a potentially powerful tool for genome annotation and functional studies. To investigate this hypothesis, we sequenced two contiguous fragments from the H. contortus genome and undertook detailed annotation and comparative analysis with C. elegans. The adult H. contortus transcriptome was sequenced using an Illumina platform and RNA-seq was used to annotate a 409 kb overlapping BAC tiling path relating to the X chromosome and a 181 kb BAC insert relating to chromosome I. In total, 40 genes and 12 putative transposable elements were identified. 97.5% of the annotated genes had detectable homologues in C. elegans, of which 60% had putative orthologues, significantly higher than in previous analyses based on ESTs. Gene density appears to be lower in H. contortus than in C. elegans, with annotated H. contortus genes being on average two to three times larger than their putative C. elegans orthologues due to a greater intron number and size. Synteny appears high but gene order is generally poorly conserved, although areas of conserved microsynteny are apparent. C. elegans operons appear to be partially conserved in H. contortus. Our findings suggest that a combination of RNA-seq and comparative analysis with C. elegans is a powerful approach for the annotation and analysis of strongylid nematode genomes.

Public Library of Science (PLOS)

Enlighten

Telomeric expression sites are highly conserved in Trypanosoma brucei

Author: Andrew Jackson
AP Jackson
Brian White
Carol Churcher
Christiane Hertz-Fowler
D Horn
D Salmon
Danielle Walker
David Harris
David Saunders
E Pays
E Wirtz
Edward J. Louis
FE Pryde
G Rudenko
George A. M. Cross
Gloria Rudenko
GS Lamont
H Hirumi
H Shimodaira
HC Mefford
Heidi Hauser
HG van Luenen
HV Xong
I Ansorge
Ian Goodhead
IC Florent
J. David Barry
JD Barry
JD Thompson
JE Haber
Jesse E. Taylor
JG Johnson
K Rutherford
K Scheffler
K Sheader
Karen Brooks
Karen Mungall
Kathy Seeger
KM Gottesdiener
L Marcello
L Vanhamme
Luisa M. Figueiredo
M Becker
M Berriman
M Cross
M Hoek
M Navarro
Magdalena Kartvelishvili
Mandy Sanders
Marion Becker
Matthew Berriman
MB Redpath
Michael A. Quail
MJ Ligtenberg
N Aitcheson
Nathalie Bason
ND Fedorova
Neil Hall
O Dreesen
P Paindavoine
P Rice
Paul Heath
PJ Myler
RL Barnes
Rosanna Young
RW Morgan
S Callejas
S Guindon
Samah Fahkro
Sarah Sharp
SE Melville
SL Kosakovsky Pond
SL Pond
TC Bruen
V Leech
VB Carruthers
WS Wong
Publication venue: Public Library of Science (PLoS)
Publication date: 21/08/2008

Subtelomeric regions are often under-represented in genome sequences of eukaryotes. Among the best-known examples of the use of telomere proximity for adaptive purposes are the bloodstream expression sites (BESs) of the African trypanosome Trypanosoma brucei. To enhance our understanding of BES structure and function in host adaptation and immune evasion, the BES repertoire from the Lister 427 strain of T. brucei was independently tagged and sequenced. BESs are polymorphic in size and structure but reveal a surprisingly conserved architecture in the context of extensive recombination. Very small BESs do exist and many functioning BESs do not contain the full complement of expression site-associated genes (ESAGs). The consequences of duplicated or missing ESAGs, including ESAG9, a newly named ESAG12, and additional variant surface glycoprotein genes (VSGs), were evaluated by functional assays after BESs were tagged with a drug-resistance gene. Phylogenetic analysis of constituent ESAG families suggests that BESs are sequence mosaics and that extensive recombination has shaped the evolution of the BES repertoire. This work opens important perspectives in understanding the molecular mechanisms of antigenic variation, a widely used strategy for immune evasion in pathogens, and telomere biology.